class: center, middle, inverse, title-slide .title[ # ISA 444/544: Business Forecasting ] .subtitle[ ## 15: Stationarity, Differencing and Unit Root Tests ] .author[ ###
Fadel M. Megahed, PhD
Professor
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
]
.date[
### Fall 2025
]

---

## Learning Objectives for Today's Class

- Define a **stationary time series**.
- Understand how plotting techniques can help identify non-stationarity.
- Understand how we can **stabilize a non-stationary time series to achieve stationarity** (differencing to stabilize the mean, and log transformation to stabilize the variance).
- Understand how to use **unit root tests** to test for stationarity.

---
class: inverse, center, middle

# What is Stationarity?

---

## Stationarity

.content-box-gray[
**Definition**: If `\(\{y_t\}\)` is a **stationary time series**, then for all `\(s\)`, the distribution of `\((y_t,\dots,y_{t+s})\)` does not depend on `\(t\)`.
]

<img src="data:image/png;base64,#../../figures/time_series_animation_stationarity.gif" alt="Animated illustration of a stationary time series with highlighted windows and statistics." width="100%" style="display: block; margin: auto;" />

---
count:false

## Stationarity

.content-box-gray[
**Definition**: If `\(\{y_t\}\)` is a **stationary time series**, then for all `\(s\)`, the distribution of `\((y_t,\dots,y_{t+s})\)` does not depend on `\(t\)`.
]

- This means that a **stationary time series**:
  - is roughly horizontal (i.e., no trend)
  - has a constant variance (i.e., no heteroskedasticity)
  - has no patterns that are predictable in the long-term (i.e., no seasonality)

---

## Class Activity: Stationarity or Non-Stationarity?

.panelset[
.panel[.panel-name[Task]

In a **Kahoot**, we will explore various time series data and determine whether they are stationary or non-stationary.
]
.panel[.panel-name[KR]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/kroger-1.png" alt="Kroger's Stock Price Last Year" width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Δ KR]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/delta_kroger-1.png" alt="The daily change in Kroger's stock price over the last year" width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Strikes]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/strikes-1.png" alt="A line plot showing the number of workers involved in strikes in the U.S. from 1947 to 2024." width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[House Sales]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/house_sales-1.png" alt="A line plot showing the number of monthly new home sales in the U.S." width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Lynx]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/lynx-1.png" alt="A line plot showing the number of lynx trappings in Canada from 1821 to 1934." width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Beer]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/beer-1.png" alt="A line plot showing Australia's quarterly beer production." width="100%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Travel]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/air_passengers-1.png" alt="A line plot showing the number of monthly airline passengers from 1949 to 1960." width="100%" style="display: block; margin: auto;" />
]
]

---

## How to Identify Non-Stationarity in the Mean

- **Visual Inspection** of the **time series plot (line chart)**:
  - Look for trends in the time series plot.
  - Look for seasonality in the time series plot.

- Examine the **ACF** of the time series:
  + A **slow decay** in the ACF plot `\(\longrightarrow\)` **non-stationary**.
  + On the other hand, if the ACF plot **drops to zero quickly** `\(\longrightarrow\)` **likely stationary**.
  + **Note:** For non-stationary time series, the **value of `\(r_1\)` is often large and positive**.

---

## Example: The Non-Stationarity in Kroger Stock Price

.left-code[
.font90[

``` python
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns

path = "../../data/kroger_stock_2024.csv"

kr = (
  pd.read_csv(path)
  .assign(
    ds=lambda x: pd.to_datetime(x['date']),
    m=lambda x: x['ds'].dt.month_name().str[:3],
  )
)

plt.figure(figsize=(8, 5))

ax = sns.lineplot(
  data=kr, x='ds', y='price', color='black'
)

sns.scatterplot(
  data=kr, x='ds', y='price', hue='m',
  s=40, ax=ax, palette='Paired'
)

ax.set(
  title="Kroger's Daily Adjusted Closing Price",
  xlabel="Date",
  ylabel="Adjusted Daily Closing Price",
)
```
]
]

.right-plot[
<br><br>
<img src="data:image/png;base64,#15_stationarity_files/figure-html/kroger1_out-2.png" alt="Kroger's Stock Price Last Year" width="100%" style="display: block; margin: auto;" />
]

---

## Example: The Non-Stationarity in Kroger Stock Price

.left-code[
.font90[

``` python
from statsmodels.tsa.stattools import acf
from statsmodels.graphics.tsaplots import plot_acf

# Using the data from Slide 7
acf_df = pd.DataFrame(
  {'lag': range(0, 21),
   'acf': acf(kr['price'], nlags=20)
  }
)

# Print the first few rows of the ACF df
acf_df.head()

# Plot the ACF on a single figure (plot_acf
# creates its own figure unless an existing
# axis is passed via the ax argument)
fig, ax = plt.subplots(figsize=(8, 5))
plot_acf(kr['price'], lags=20, ax=ax)
plt.xticks(ticks=range(0, 21))
plt.xlabel('Lag')
plt.ylabel('ACF')
plt.title("ACF of Kroger's Stock Price")
plt.show()
```
]
]

.right-plot[
.font80[

```
##    lag     acf
## 0    0  1.0000
## 1    1  0.9737
## 2    2  0.9467
## 3    3  0.9189
## 4    4  0.8954
```
]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/kroger2_out-5.png" alt="ACF of 
Kroger's Stock Price" width="100%" style="display: block; margin: auto;" />
]

---

## How Does the ACF of a Stationary Time Series Look?

- For a **stationary time series**, the **ACF** plot will **decay to zero quickly**.
  + Non-significant autocorrelations fall within the **blue shaded region**, and are **statistically not different from zero**.

.left-code[
.font90[

``` python
*kr['delta'] = kr['price'].diff()

acf_df = pd.DataFrame(
  {'lag': range(0, 21),
   'acf': acf(
*     kr['delta'], nlags=20, missing='drop'
   ).round(4)
  }
)

acf_df.head()

fig, ax = plt.subplots(figsize=(8, 5))
*plot_acf(kr['delta'], lags=20, missing='drop', ax=ax)
plt.xticks(ticks=range(0, 21))
plt.xlabel('Lag')
plt.ylabel('ACF')
plt.title("ACF of Kroger's Diff in Stock Price")
```
]
]

.right-plot[
.font80[

```
##    lag     acf
## 0    0  1.0000
## 1    1 -0.0274
## 2    2 -0.0382
## 3    3 -0.1016
## 4    4 -0.0387
```
]

<img src="data:image/png;base64,#15_stationarity_files/figure-html/krog_diff_out-8.png" alt="ACF of Kroger's Change in Stock Price" width="100%" style="display: block; margin: auto;" />
]

---
class: inverse, center, middle

# Mathematical Background on the 95% Limits of the ACF Plot
# (Understanding how `statsmodels` computes confidence intervals)

---

## 1. Standard Error of ACF for White Noise

For an **uncorrelated white noise** process, the standard error (SE) of the sample ACF at lag `\(k\)` is:

`$$\text{SE}(\hat{\rho}_k) \approx \frac{1}{\sqrt{N}}$$`

- Here, `\(N\)` is the **sample size**.
- Assumes the series is **white noise**: no autocorrelation beyond lag 0.
- The **95% confidence interval (CI)** is:

`$$\pm 1.96 \times \underbrace{\frac{1}{\sqrt{N}}}_{\color{black}{\text{Standard Error (SE)}}}$$`

💡 **Key idea:** More data `\(\longrightarrow\)` **Smaller SE** `\(\longrightarrow\)` **Narrower confidence bands**.

???

**Speaker Notes:**

- The standard error of the ACF is inversely proportional to the square root of `\(N\)`.
- This formula applies under the assumption that the time series is **white noise**.
- The confidence intervals are computed as **`\(\pm 1.96 \times SE\)`**, meaning that the limits shrink when `\(N\)` is large.
- However, real-world time series often **exhibit autocorrelation**, requiring a more complex SE formula.

---

## 2. Bartlett’s Formula for SE of ACF

For a **stationary** time series, the **SE must account for autocorrelation**:

`$$\text{SE}(\hat{\rho}_k) \approx \sqrt{ \frac{1}{N} \left( 1 + 2 \sum_{j=1}^{k-1} \hat{\rho}_j^2 \right) }$$`

The **sum of squared autocorrelations**, `\(\underbrace{2 \sum_{j=1}^{k-1} \hat{\rho}_j^2}_{\color{black}{\text{Effect of past lags}}}\)`, increases SE, which leads to **wider confidence bands**.

**Hence:**

- If the time series is **highly correlated** (e.g., a **random walk**), SE **increases**.
- **Differencing reduces autocorrelation**, making the sum **smaller** and the SE **lower**.

???

**Speaker Notes:**

- In **real-world** time series, past values influence future values, which means autocorrelation exists.
- Bartlett’s formula modifies the white noise SE formula by incorporating **autocorrelation at prior lags**.
- The key takeaway is that **higher autocorrelation increases SE**, which **widens** the confidence intervals.
- If the time series is highly persistent (like a random walk), the CI will be **much wider**.
- Differencing **reduces autocorrelation**, leading to **smaller** SE and **narrower** CIs.

---

## 3. Why Do CIs Widen at Larger Lags?

Even though **autocorrelations shrink at higher lags**, the confidence intervals still widen. .black[.bold[Why?]]

`$$\text{SE}(\hat{\rho}_k) \approx \sqrt{ \frac{1}{N} \left( 1 + 2 \sum_{j=1}^{k-1} \hat{\rho}_j^2 \right) }$$`

<br>

**Larger lags** `\(\longrightarrow\)` `\((\underbrace{1 + 2 \sum_{j=1}^{k-1} \hat{\rho}_j^2}_{\color{black}{\text{Cumulative sum increases}}})\)` `\(\longrightarrow\)` **Higher SE**.

**Fewer overlapping observations** at higher lags `\(\longrightarrow\)` **Increased uncertainty**.
<br>

**Result:** Confidence intervals **widen at larger lags**, even if `\(\hat{\rho}_k\)` is small.

???

**Speaker Notes:**

- Many assume that since `\(\hat{\rho}_k\)` shrinks at larger lags, the CI should also shrink, but the opposite happens.
- The **cumulative sum** of past autocorrelations **grows** in Bartlett’s formula, leading to **larger SE**.
- Another reason: At higher lags, **fewer pairs of observations** contribute to the estimate, increasing uncertainty.
- This is why **ACF plots often show wider CIs for larger lags**, even if `\(\hat{\rho}_k\)` is close to zero.

---

## 4. Impact of Differencing on Confidence Limits

When **differencing** a time series:

`$$y'_t = y_t - y_{t-1}$$`

- **Differencing reduces autocorrelation**, shrinking the **Bartlett sum** in the SE computation:

`$$\underbrace{\sum_{j=1}^{k-1} \hat{\rho}_j^2}_{\color{black}{\text{smaller after differencing}}}$$`

- **This decreases the SE**, leading to **narrower CIs** compared to the original series.
- This explains **why** `plot_acf()` **shows:**
  - .black[.bold[Wider limits for a trending or non-stationary series.]]
  - .black[.bold[Narrower limits after differencing.]]

???

**Speaker Notes:**

- Differencing **transforms** a highly autocorrelated series into a more **stationary** one.
- This has a direct impact on **Bartlett’s formula**: The sum of past autocorrelations **shrinks**, lowering SE.
- Lower SE `\(\longrightarrow\)` **Narrower confidence intervals**.
- This explains why:
  - A **non-stationary series (random walk)** has **wide CIs**.
  - A **stationary differenced series** has **narrower CIs**.

---

## 🔑 Key Takeaways

✅ Original series with .black[.bold[strong autocorrelation]] `\(\longrightarrow\)` **Wider confidence bands**

✅ Differenced series with .black[.bold[weaker autocorrelation]] `\(\longrightarrow\)` **Narrower confidence bands**

✅ **CIs widen at larger lags** due to .black[.bold[accumulating estimation uncertainty]]

???
**Speaker Notes:**

- To summarize:
  - If a time series has strong **autocorrelation**, its **confidence intervals are wide**.
  - If we **difference the series**, autocorrelation decreases `\(\longrightarrow\)` **narrower** CIs.
  - Higher lags **accumulate uncertainty**, which is why CIs expand for larger lags.
- Understanding these statistical effects helps correctly **interpret ACF plots** in time series analysis.

---
class: inverse, center, middle

# Stabilizing Non-Stationary Time Series

---

## Stabilizing the Mean: Differencing

- **Differencing** is a common technique to **stabilize the mean** of a time series.

- The **first difference** of a time series is the **change** between consecutive observations in the original series:
  - `\(\Delta y_t = \underbrace{y'_t}_{\begin{array}{c} \text{First} \\ \text{Difference} \end{array}} = \underbrace{y_t - y_{t-1}}_{\begin{array}{c} \text{Current value} \\ \text{minus previous value} \end{array}}\)`

- The **differenced series** will have only `\(T-1\)` observations since it is not possible to calculate a difference `\(y_1'\)` for the first observation.

---

## Second-order Differencing

Occasionally the differenced data will not appear stationary and it may be necessary to difference the data a second time:

`$$\begin{align} \underbrace{y''_{t}}_{\text{Second Difference}} & = \underbrace{y'_{t} - y'_{t - 1}}_{\text{First Difference of the First Difference}} \\ & = \underbrace{(y_t - y_{t-1})}_{\text{First Difference at } t} - \underbrace{(y_{t-1}-y_{t-2})}_{\text{First Difference at } t-1}\\ & = \underbrace{y_t - 2y_{t-1} +y_{t-2}}_{\text{Final expanded form}}. \end{align}$$`

* `\(y_t''\)` will have `\(T-2\)` values.

* In practice, it is rarely necessary to go beyond second-order differences (a different representation of the **data generation process** may be needed).

---

## Seasonal Differencing

A **seasonal difference** is the difference between an observation and the corresponding observation from the previous year.
`$$y'_t = y_t - y_{t-m}$$`

where `\(m=\)` number of seasons.

* For monthly data `\(m=12\)`.
* For quarterly data `\(m=4\)`.
* Seasonally differenced series will have .red[ `\(T-m\)` ] observations.

---

## Interpretation of Differencing

- **First differences** are the change between **one observation and the next**.

- **Seasonal differences** are the change between **one observation and the same observation in the previous year** (i.e., year-over-year changes).

---

## Stabilizing the Variance: Log Transformation

**Log transformation** is a common technique to **stabilize the variance** of a time series.

<img src="data:image/png;base64,#../../figures/air_passengers_animation.gif" width="85%" style="display: block; margin: auto;" />

---

## Stabilizing the Variance and the Mean: Log-Differencing

**Log-differencing** is a combination of **log transformation** and **differencing**.

<img src="data:image/png;base64,#../../figures/air_passengers_animation_log_diff.gif" width="85%" style="display: block; margin: auto;" />

---

## One Final Step: DeSeasonalizing the Time Series

<img src="data:image/png;base64,#../../figures/air_passengers_animation_log_diff_seas.gif" width="85%" style="display: block; margin: auto;" />

---

## Order Does Not Matter: Seasonal and First Differences

When both seasonal and first differences are applied `\(\dots\)`

* it makes no difference which is done first---the result will be the same.

If `\(y'_t = y_t - y_{t-12}\)` denotes the seasonally differenced series, then the twice-differenced series is

`$$\begin{align} y^*_t &= y'_t - y'_{t-1} \\ &= (y_t - y_{t-12}) - (y_{t-1} - y_{t-13}) \\ &= y_t - y_{t-1} - y_{t-12} + y_{t-13}\: . \end{align}$$`

This is equivalent to taking the seasonal difference of the first-differenced series: if `\(z_t = y_t - y_{t-1}\)` denotes the first-differenced series, then

`$$\begin{align} y^*_t &= z_t - z_{t-12} \\ &= (y_t - y_{t-1}) - (y_{t-12} - y_{t-13}) \\ &= y_t - y_{t-1} - y_{t-12} + y_{t-13}\: . 
\end{align}$$`

---

## Why Apply Seasonal Differencing First?

Hyndman & Athanasopoulos (2022) recommend applying seasonal differencing first because:

- If seasonal differencing alone makes the series stationary, we do not need further differencing.

- If we first apply regular differencing, the seasonal pattern may remain strong, meaning an extra seasonal differencing step will still be necessary.

- **Seasonal differencing is often more effective** at removing non-stationarity in data with **strong seasonality**.

---
class: inverse, center, middle

# Unit Root Tests

---

## Unit Root Tests: ADF Test

- **Unit root tests** are used to determine whether a time series is **stationary** or **non-stationary**.

- The most common unit root test is the **Augmented Dickey-Fuller (ADF) test**.

- The null hypothesis of the ADF test is that the time series has a **unit root** (i.e., it is **non-stationary**).

- If the null hypothesis is rejected (e.g., if `\(p < 0.05\)`), the time series is considered **stationary**.

- The ADF test is available in Python through the `statsmodels` library, and can be accessed using the `adfuller()` function as follows:

.center[ `from statsmodels.tsa.stattools import adfuller` ]

**Note:** See [this link](https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.adfuller.html) for more information on the `adfuller()` function.

---

## Unit Root Tests: KPSS Test

- The **Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test** is a complementary stationarity test that is also used to assess whether a time series is **stationary** or **non-stationary**.

- The null hypothesis of the KPSS test is that the time series is **stationary** around a **fixed level** (note that this null is the reverse of the ADF null).

- If the null hypothesis is rejected (e.g., if `\(p < 0.05\)`), the time series is considered **non-stationary**.
- The KPSS test is available in Python through the `statsmodels` library, and can be accessed using the `kpss()` function as follows:

.center[ `from statsmodels.tsa.stattools import kpss` ]

**Note:** See [this link](https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.kpss.html) for more information on the `kpss()` function.

---

## CoreForecast's Shortcut Based on R's `forecast` Package

- In R, the `forecast` package provides a convenient function, `ndiffs()`, to determine the number of differences required to make a time series stationary.

- By default, this function uses the **KPSS test** to make that determination.

- In [CoreForecast](https://nixtlaverse.nixtla.io/coreforecast/differences), we can use the `num_diffs(x: ndarray, max_d: int = 1) -> int` function to determine the number of differences required to make a time series stationary. Similarly, we can use `num_seas_diffs(x: ndarray, season_length: int, max_d: int = 1) -> int` to determine the number of seasonal differences required.

- **Reference:** [CoreForecast Documentation](https://nixtlaverse.nixtla.io/coreforecast/differences).

---

## Demo: Determining the Number of Differences Required

Let us examine the [airpassengers](https://raw.githubusercontent.com/fmegahed/isa444/refs/heads/main/data/airpassengers.csv) dataset. Our goal is to make this dataset stationary by applying the appropriate transformations.

---
class: inverse, center, middle

# Recap

---

## Summary of Main Points

By now, you should be able to do the following:

- Define a **stationary time series**.

- Understand how plotting techniques can help identify non-stationarity.

- Understand how we can **stabilize a non-stationary time series to achieve stationarity** (differencing to stabilize the mean, and log transformation to stabilize the variance).

- Understand how to use **unit root tests** to test for stationarity.
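
---

## Appendix: Verifying that Differencing Order Does Not Matter

A minimal numerical check of the order-invariance result from earlier, using `pandas.Series.diff()` for both ordinary (`diff(1)`) and seasonal (`diff(12)`) differences. The simulated monthly series below is illustrative only (it is not the course's `airpassengers.csv`).

``` python
import numpy as np
import pandas as pd

# Simulated monthly series with a linear trend and an annual seasonal
# pattern (illustrative data only -- not the course dataset).
rng = np.random.default_rng(42)
t = np.arange(120)
y = pd.Series(
    10 + 0.5 * t
    + 5 * np.sin(2 * np.pi * t / 12)
    + rng.normal(0, 1, size=120)
)

# Seasonal difference (m = 12) first, then a first difference ...
seas_then_first = y.diff(12).diff(1)

# ... versus a first difference, then a seasonal difference.
first_then_seas = y.diff(1).diff(12)

# Both orders expand to y_t - y_{t-1} - y_{t-12} + y_{t-13}, so the two
# series agree (up to floating-point rounding) and each retains
# T - m - 1 = 120 - 13 = 107 usable observations.
print(np.allclose(seas_then_first.dropna(), first_then_seas.dropna()))
print(seas_then_first.dropna().size)
```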
--- ## 📝 Review and Clarification 📝 1. **Class Notes**: Take some time to revisit your class notes for key insights and concepts. 2. **Zoom Recording**: The recording of today's class will be made available on Canvas approximately 3-4 hours after the session ends. 3. **Questions**: Please don't hesitate to ask for clarification on any topics discussed in class. It's crucial not to let questions accumulate.